Static and Dynamic Processor Allocation for Higher-Order Concurrent Languages
Abstract
… abstraction of the non-negative natural numbers. The correctness of the complete analysis then follows from the subject reduction result of [13] that allows us to lift safety (as opposed to liveness) results from the behaviours to safety results for CML programs. We also address the implementation of the second stage of the analysis. Here the idea is to transform the problem as specified by the syntax-directed inference system into a syntax-free equation solving problem where standard techniques from data flow analysis can be used to obtain fast implementations. (As already mentioned the implementation of the first stage is the topic of [14, 1].)

Comparison with other work. First we want to stress that our approach to processor allocation is that of static program analysis rather than, say, heuristics based on profiling as is often found in the literature on the implementation of concurrent languages. In the literature there are only few program analyses for combined functional and concurrent languages. An extension of SML with Linda communication primitives is studied in [3] and, based on the corresponding process algebra, an analysis is presented that provides useful information for the placement of processes on a finite number of processors. A functional language with communication via shared variables is studied in [9] and its communication patterns are analysed, again with the goal of producing useful information for processor (and storage) allocation. A couple of program analyses have also been developed for concurrent languages with an imperative facet. The papers [4, 8, 15] all present reachability analyses for concurrent programs with a statically determined communication topology; only [15] shows how this restriction can be lifted to allow communication in the style of the π-calculus. Finally, [11] presents an analysis determining the number of communications on each channel connecting two processes in a CSP-like language.

As mentioned our analysis is specified in two stages.
The first stage is formalised in [13, 14]; similar considerations were carried out by Havelund and Larsen leading to a comparable process algebra [6] but with no formal study of the link to CML nor with any algorithm for automatically extracting behaviours. The same overall idea is present in [3] but again with no formal study of the link between the process algebra and the programming language. The second stage of the analysis extracts much more detailed information from the behaviours and this leads to a much more complex notion of correctness than in [13]. Furthermore, the analysis is parameterised on the choice of value space, thereby incorporating ideas from abstract interpretation.

2 Behaviours

Full details of the syntax of CML are not necessary for the developments of the present paper. It will suffice to introduce a running example and to use it to motivate the process algebra of CML.

Example 2.1 Suppose we want to define a program pipe [f1,f2,f3] in out that constructs a pipeline of processes: the sequence of inputs is taken over channel in, the sequence of outputs is produced over channel out and the functions f1, f2, f3 (and the identity function id defined by fn x => x) are applied in turn. To achieve concurrency we want separate processes for each of the functions f1, f2, f3 (and id). This system might be depicted as follows:

  in → [f1] → ch1 → [f2] → ch2 → [f3] → ch3 → [id] → out

where each of the four processes may additionally report on the channel fail. Here ch1, ch2, and ch3 are new internal channels for interconnecting the processes; and fail is a channel over which failure of operation may be reported. Taking the second process as an example, it may be created by the CML expression node f2 ch1 ch2 where the function node is given by

  fn f => fn in => fn out =>
    fork (rec loop d =>
      sync (choose [wrap (receive in,
                          fn x => sync (send (out, f x)); loop d),
                    send (fail, ())]))

Here f is the function to be applied, in is the input channel and out is the output channel.
The function fork creates a new process labelled π that behaves as described by the recursive function loop that takes the dummy parameter d. In each recursive call the function may either report failure by send(fail,()) or it may perform one step of the processing: receive the input by means of receive in, take the value x received and transmit the modified value f x by means of send(out,f x), after which the process repeats itself by means of loop d. The primitive choose performs an unspecified choice between the two communication possibilities and wrap modifies a communication by postprocessing the value received or transmitted. The sync primitive enforces synchronisation at the right points and we refer to [16] for a discussion of the language design issues involved in this; once we have arrived at the process algebra such considerations will be of little importance to us.

The overall construction of the network of processes is then the task of the pipe function defined by

  rec pipe fs => fn in => fn out =>
    if isnil fs then node (fn x => x) in out
    else let ch = channel ()
         in (node (hd fs) in ch; pipe (tl fs) ch out)

Here fs is the list of functions to be applied, in is the input channel, and out is the output channel. If the list of functions is empty we connect in and out by means of a process that applies the identity function; otherwise we create a new internal channel by means of channel (), create the process for the first function in the list and then recurse on the remainder of the list.

The process algebra of CML [13] allows us to give succinct representations of the communications taking place in CML programs. The terms of the process algebra are called behaviours, denoted b ∈ Beh, and are given by

  b ::= ε | L!t | L?t | t chanL | β | forkL b | b1; b2 | b1 + b2 | rec β. b

where L ⊆ Labels is a non-empty and finite set of program labels. The behaviour ε is associated with the pure functional computations of CML.
The behaviours L!t and L?t are associated with sending and receiving values of type t over channels with label in L, the behaviour t chanL is associated with creating a new channel with label in L over which values of type t can be communicated, and the behaviour forkL b is associated with creating a new process with behaviour b and with label in L. Together these behaviours constitute the atomic behaviours, denoted p ∈ ABeh, as may be expressed by setting

  p ::= ε | L!t | L?t | t chanL | forkL b

Finally, behaviours may be composed by sequencing (as in b1; b2) and internal choice (as in b1 + b2) and we use behaviour variables β together with an explicit rec construct to express recursive behaviours. The structure of the types, denoted t ∈ Typ, shall be of little concern to us in this paper and we shall therefore leave it mostly unspecified (but see [13]); however, we need to state that t chanL is the type of a channel with label in L over which elements of type t may be communicated. Since types might conceivably contain behaviours, the notion of free variables needs to be replaced by a notion of exposed variables: we shall say that a behaviour variable is exposed in a behaviour b if it has a free occurrence that is not a subterm of any type mentioned in b.

Example 2.2 Assuming that fail is a channel of type unit chanL, the type inference system of [13] can be used to prove that pipe has type

  (α →β α) list → α chanL1 → α chanL2 →b unit

where b is

  rec β′. (  forkπ (rec β″. (L1?α; β; L2!α; β″ + L!unit))
           + α chanL1; forkπ (rec β″. (L1?α; β; L2!α; β″ + L!unit)); β′ )

Thus the behaviour expresses directly that the pipe function is recursively defined and that it either spawns a single process, or creates a channel, spawns a process and recurses. The spawned processes will all be recursive and they will either report failure over a channel in L and terminate, or else input over a channel in L1, do something (as expressed by α and β), output over a channel in L2 and recurse.
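Before turning to the semantics of behaviours it may help to see the pipeline of Example 2.1 running. The following Python sketch is only an analogue, not CML: `queue.Queue` stands in for channels, a `None` end-of-stream marker replaces the `fail` branch, and the names `node` and `pipe` merely mirror the example.

```python
import queue
import threading

def node(f, inp, out):
    """Spawn a process that repeatedly receives on inp, applies f,
    and sends the result on out (the fail branch is elided)."""
    def loop():
        while True:
            x = inp.get()
            if x is None:          # end-of-stream marker instead of fail
                out.put(None)
                return
            out.put(f(x))
    threading.Thread(target=loop, daemon=True).start()

def pipe(fs, inp, out):
    """Connect inp to out through one process per function in fs."""
    if not fs:
        node(lambda x: x, inp, out)   # the identity process
    else:
        ch = queue.Queue()            # fresh internal channel
        node(fs[0], inp, ch)
        pipe(fs[1:], ch, out)

inp, out = queue.Queue(), queue.Queue()
pipe([lambda x: x + 1, lambda x: x * 2, lambda x: x - 3], inp, out)
for v in [1, 2, 3]:
    inp.put(v)
inp.put(None)
results = []
while (v := out.get()) is not None:
    results.append(v)
print(results)   # [1, 3, 5]
```

Since each stage is a single thread reading a FIFO queue, the output order is deterministic even though the four stages run concurrently.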
The semantics of behaviours is defined by a transition relation of the form PB =[a; ps]⇒ PB′ where PB and PB′ are mappings from process identifiers to closed behaviours and the special symbol √ denoting termination. Furthermore, a is an action that takes place and ps is a list of the processes that take part in the action. The actions rather closely correspond to atomic behaviours and are given by

  a ::= ε | L!t?L′ | t chanL | forkL b

If the transition PB =[a; ps]⇒ PB′ has a = ε this means that one of the behaviours in PB performed some internal computation that did not involve communication; in other words it performed the atomic behaviour ε. If a = L!t?L′ this means that two distinct behaviours performed a communication: one performed the atomic behaviour L!t and the other the atomic behaviour L′?t. Finally, if a = t chanL or a = forkL b this means that one of the behaviours in PB allocated a new channel or forked a new process. Since we have covered all possibilities of atomic behaviours we have also covered all possibilities of actions. We refer to [13] for the precise details of the semantics as these are of little importance for the development of the analyses.

3 Value Spaces

In the analyses we want to predict the number of times certain events may happen. The precision as well as the complexity of the analyses will depend upon how we count, so we shall parameterise the formulation of the analyses on our notion of counting. This amounts to abstracting the non-negative integers N by a complete lattice (Abs, ⊑). As usual we write ⊥ for the least element, ⊤ for the greatest element, ⨆ and ⊔ for least upper bounds, and ⊓ for greatest lower bounds. The abstraction is expressed by a function R : N → Abs that is strict (has R(0) = ⊥) and monotone (has R(n1) ⊑ R(n2) whenever n1 ≤ n2); hence the ordering on the natural numbers is reflected in the abstract values. Three elements of Abs are of particular interest and we shall introduce special syntax for them:

  o = R(0) = ⊥
  i = R(1)
  m = ⊤

We cannot expect our notion of counting to be precisely reflected by Abs; indeed it is likely that we shall allow identifying for example R(2) and R(3) and perhaps even R(1) and R(2). However, we shall ensure throughout that no identifications involve R(0) by demanding that R⁻¹(o) = {0} so that o really represents "did not happen".

We shall be interested in two binary operations on the non-negative integers. One is the operation of maximum: max{n1, n2} is the larger of n1 and n2. In Abs we shall use the binary least upper bound operation ⊔ to express the maximum operation. Indeed R(max{n1, n2}) = R(n1) ⊔ R(n2) holds by monotonicity of R, as do the laws n1 ⊑ n1 ⊔ n2, n2 ⊑ n1 ⊔ n2 and n ⊔ n = n. As a consequence n1 ⊔ n2 = o iff both n1 and n2 equal o.

The other operation is addition: n1 + n2 is the sum of n1 and n2. In Abs we shall have to define a function ⊕ and demand that (Abs, ⊕, o) is an Abelian monoid with ⊕ monotone. This ensures that we have the associative law n1 ⊕ (n2 ⊕ n3) = (n1 ⊕ n2) ⊕ n3, the absorption laws n ⊕ o = o ⊕ n = n, the commutative law n1 ⊕ n2 = n2 ⊕ n1, and by monotonicity we also have the laws n1 ⊑ n1 ⊕ n2 and n2 ⊑ n1 ⊕ n2. As a consequence n1 ⊕ n2 = o iff both n1 and n2 equal o. To ensure that ⊕ models addition on the integers we impose the condition ∀n1, n2: R(n1 + n2) ⊑ R(n1) ⊕ R(n2) that is common in abstract interpretation.

Definition 3.1 A value space is a structure (Abs, ⊑, o, i, m, ⊕, R) as detailed above. It is an atomic value space if i is an atom (that is, o ⊑ n ⊑ i implies that o = n or i = n).

Example 3.2 One possibility is to use A3 = {o, i, m} and define ⊑ by o ⊑ i ⊑ m. The abstraction function R will then map 0 to o, 1 to i and all other numbers to m. The operations ⊔ and ⊕ can then be given by the following tables:

  ⊔ | o i m        ⊕ | o i m
  --+------        --+------
  o | o i m        o | o i m
  i | i i m        i | i m m
  m | m m m        m | m m m

This defines an atomic value space.
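The value space A3 is small enough to prototype directly. The sketch below encodes o, i, m as the integers 0, 1, 2 (an encoding of our own choosing; the order then coincides with the integer order), and checks the soundness condition R(n1 + n2) ⊑ R(n1) ⊕ R(n2) on a few samples.

```python
# The three-element value space A3 = {o, i, m}, ordered o ⊑ i ⊑ m,
# encoded as integers so that ⊑ is just <=.
O, I, M = 0, 1, 2

def R(n):
    """Abstraction of a non-negative integer: 0 -> o, 1 -> i, else m."""
    return O if n == 0 else I if n == 1 else M

def join(a, b):
    """⊔: binary least upper bound, abstracting max."""
    return max(a, b)

def plus(a, b):
    """⊕: abstract addition; e.g. i ⊕ i = m, matching the table."""
    return min(a + b, M)

# R(n1 + n2) ⊑ R(n1) ⊕ R(n2) for a few samples:
assert all(R(a + b) <= plus(R(a), R(b)) for a in range(5) for b in range(5))
print(join(I, M), plus(I, I))   # 2 2  (i ⊔ m = m and i ⊕ i = m)
```

Note that `min(a + b, M)` reproduces exactly the ⊕ table of Example 3.2 under this encoding; it is a convenience of the encoding, not part of the definition.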
For two value spaces (Abs′, ⊑′, o′, i′, m′, ⊕′, R′) and (Abs″, ⊑″, o″, i″, m″, ⊕″, R″) we may construct their cartesian product (Abs, ⊑, o, i, m, ⊕, R) by setting Abs = Abs′ × Abs″ and by defining ⊑, o, i, m, ⊕ and R componentwise. This defines a value space but it is not atomic even if Abs′ and Abs″ both are. As a consequence i = (i′, i″) will be of no concern to us; instead we use (o′, i″) and (i′, o″) as appropriate.

For a value space (Abs′, ⊑′, o′, i′, m′, ⊕′, R′) and a non-empty set E of events we may construct the indexed value space (or function space) (Abs, ⊑, o, i, m, ⊕, R) by setting Abs = E → Abs′ (the set of total functions from E to Abs′) and by defining ⊑, o, i, m, ⊕ and R componentwise. This defines a value space that is almost never atomic; as a consequence i = λe. i′ will be of no concern to us. For indexed value spaces we may represent (f ∈ E → Abs) by (rep(f) ∈ E ⇀ Abs\{o}) where E ⇀ Abs\{o} denotes the set of partial functions from E to Abs\{o}; here rep(f) maps e to n iff f(e) = n and n ≠ o. In practice we want to restrict E to be a finite set in order to obtain finite representations; we write (f ∈ E →f Abs) to indicate that f is o on all but a finite number of arguments so that such a representation is possible.

4 Counting the Behaviours

For a given behaviour b and value space Abs we may ask the following four questions:

  - how many times are channels labelled by L created?
  - how many times do channels labelled by L participate in input?
  - how many times do channels labelled by L participate in output?
  - how many times are processes labelled by L generated?

  benv ⊢ ε : [ ]
  benv ⊢ L!t : [L ↦ (o, o, i, o)]
  benv ⊢ L?t : [L ↦ (o, i, o, o)]
  benv ⊢ t chanL : [L ↦ (i, o, o, o)]

  benv ⊢ b : A
  ─────────────────────────────────────────
  benv ⊢ forkL b : [L ↦ (o, o, o, i)] ⊕ A

  benv ⊢ b1 : A1   benv ⊢ b2 : A2
  ─────────────────────────────────────────
  benv ⊢ b1; b2 : A1 ⊕ A2

  benv ⊢ b1 : A1   benv ⊢ b2 : A2
  ─────────────────────────────────────────
  benv ⊢ b1 + b2 : A1 ⊔ A2

  benv[β ↦ A] ⊢ b : A
  ─────────────────────────────────────────
  benv ⊢ rec β. b : A

  benv ⊢ β : A   if benv(β) = A

  Table 1: Analysis of behaviours
To answer these questions we define an inference system with formulae

  benv ⊢ b : A

where LabSet = Pf(Labels) is the set of finite and non-empty subsets of Labels and A ∈ LabSet →f Abs records the required information. In this section we shall define the inference system for answering all four questions simultaneously. Hence we let Abs be the four-fold cartesian product Ab⁴ of an atomic value space Ab; we shall leave the formulation parameterised on the choice of Ab, but a useful candidate is the three-element value space A3 of Example 3.2 and this will be the choice in all examples. The idea is that A(L) = (nc, ni, no, nf) means that channels labelled by L are created at most nc times, that channels labelled by L participate in at most ni input operations, that channels labelled by L participate in at most no output operations, and that processes labelled by L are generated at most nf times. The behaviour environment benv then associates each behaviour variable with an element of LabSet →f Abs.

The analysis is defined in Table 1. We use [ ] as a shorthand for λL.(o, o, o, o) and [L ↦ n⃗] as a shorthand for the function mapping L to n⃗ and every L′ ≠ L to (o, o, o, o). Note that i denotes the designated "one"-element in each copy of Ab since it is the atoms (i, o, o, o), (o, i, o, o), (o, o, i, o), and (o, o, o, i) that are useful for increasing the count. In the rule for forkL we are deliberately incorporating the effects of the forked process; to avoid doing so simply remove the "⊕ A" component. The rules for sequencing, choice, and behaviour variables are straightforward given the developments of the previous section. Note that the rule for recursion expresses a fixed point property and so allows some slackness; it would be inelegant to specify a least (or greatest) fixed point property, whereas a post-fixed point¹ could easily be accommodated by incorporating a notion of subsumption into the rule.
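As a sanity check, Table 1 can be prototyped directly. The sketch below is our own rendering, not part of the paper: behaviours are tuples, labels are strings, Abs is A3⁴ with o, i, m encoded as 0, 1, 2, and the rule for rec is resolved by computing the least fixed point by iteration (the inference rule itself only demands some fixed point).

```python
def vjoin(a, b): return tuple(max(x, y) for x, y in zip(a, b))   # ⊔ on A3⁴
def vplus(a, b): return tuple(min(x + y, 2) for x, y in zip(a, b))  # ⊕ on A3⁴

def ajoin(A1, A2):
    return {L: vjoin(A1.get(L, (0,)*4), A2.get(L, (0,)*4))
            for L in set(A1) | set(A2)}

def aplus(A1, A2):
    return {L: vplus(A1.get(L, (0,)*4), A2.get(L, (0,)*4))
            for L in set(A1) | set(A2)}

def analyse(b, benv={}):
    """Behaviours as tuples: ('eps',), ('send', L), ('recv', L),
    ('chan', L), ('fork', L, b), ('seq', b1, b2), ('choice', b1, b2),
    ('rec', beta, b), ('var', beta).  A(L) = (created, in, out, forked)."""
    tag = b[0]
    if tag == 'eps':    return {}
    if tag == 'send':   return {b[1]: (0, 0, 1, 0)}
    if tag == 'recv':   return {b[1]: (0, 1, 0, 0)}
    if tag == 'chan':   return {b[1]: (1, 0, 0, 0)}
    if tag == 'fork':   return aplus({b[1]: (0, 0, 0, 1)}, analyse(b[2], benv))
    if tag == 'seq':    return aplus(analyse(b[1], benv), analyse(b[2], benv))
    if tag == 'choice': return ajoin(analyse(b[1], benv), analyse(b[2], benv))
    if tag == 'var':    return benv[b[1]]
    if tag == 'rec':    # iterate to the least fixed point (finite lattice)
        A = {}
        while True:
            A2 = analyse(b[2], {**benv, b[1]: A})
            if A2 == A:
                return A
            A = A2

# the spawned worker of Example 2.2: rec b2. (L1?;L2!;b2 + L!)
worker = ('rec', 'b2',
          ('choice',
           ('seq', ('recv', 'L1'), ('seq', ('send', 'L2'), ('var', 'b2'))),
           ('send', 'L')))
print(analyse(worker))
```

On the worker behaviour this yields many (m) inputs on L1, many outputs on L2 but only one (i) output on L, in line with the remark in Example 7.1 below that each individual worker communicates over L at most once.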
We decided not to incorporate a general subsumption rule and to aim for specifying as unique results as the rule for recursion allows.

Example 4.1 For the pipe function of Examples 2.1 and 2.2 the analysis will give the following information (read "m" as "many"):

  L1: m channels created and m inputs performed
  L2: m outputs performed
  L:  m outputs performed
  π:  m processes created

While this is evidently correct it also seems pretty uninformative; yet we shall see that this simple analysis suffices for developing more informative analyses for static and dynamic processor allocation.

To formally express the correctness of the analysis we need a few definitions. Given a list X of actions define

  COUNT(X) = λL. (CC(X, L), CI(X, L), CO(X, L), CF(X, L))

where

  CC(X, L): the number of elements of the form t chanL in X,
  CI(X, L): the number of elements of the form L′!t?L in X,
  CO(X, L): the number of elements of the form L!t?L′ in X, and
  CF(X, L): the number of elements of the form forkL b in X.

The formal version of our explanations above about the intentions of the analysis then amounts to the following soundness result:

Theorem 4.2 If [ ] ⊢ b : A and [pi0 ↦ b] =[a1; ps1]⇒ … =[ak; psk]⇒ PB then we have R*(COUNT[a1, …, ak]) ⊑ A, where R*(C)(L) = (R(c), R(i), R(o), R(f)) if C(L) = (c, i, o, f).

5 Implementation

It is well-known that compositional specifications of program analyses (whether as abstract interpretations or annotated type systems) are not the most efficient way of obtaining the actual solutions. We therefore demonstrate how the inference problem may be transformed to an equation solving problem that is independent of the syntax of our process algebra and where standard algorithmic techniques may be applied. This approach also carries over to the inference systems for processor allocation developed subsequently. The first step is to generate the set of equations.
To show that this does not affect the set of solutions we shall be careful to avoid undesirable "cross-over" between equations generated from disjoint syntactic components of the behaviour. One possible cause for such "cross-over" is that behaviour variables may be bound in more than one rec; one classical solution to this is to require that the overall behaviour be alpha-renamed such that this does not occur; the solution we adopt avoids this requirement by suitable modification of the equation system. Another possible cause for "cross-over" is that disjoint syntactic components of the overall behaviour may nonetheless have components that syntactically appear the same; we avoid this problem by the standard use of tree-addresses (denoted $).

  E[[B : $ : ε]]        = { ⟨$⟩ = [ ] }
  E[[B : $ : L!t]]      = { ⟨$⟩ = [L ↦ (o, o, i, o)] }
  E[[B : $ : L?t]]      = { ⟨$⟩ = [L ↦ (o, i, o, o)] }
  E[[B : $ : t chanL]]  = { ⟨$⟩ = [L ↦ (i, o, o, o)] }
  E[[B : $ : forkL b]]  = { ⟨$⟩ = [L ↦ (o, o, o, i)] ⊕ ⟨$1⟩ } ∪ E[[B : $1 : b]]
  E[[B : $ : b1; b2]]   = { ⟨$⟩ = ⟨$1⟩ ⊕ ⟨$2⟩ } ∪ E[[B : $1 : b1]] ∪ E[[B : $2 : b2]]
  E[[B : $ : b1 + b2]]  = { ⟨$⟩ = ⟨$1⟩ ⊔ ⟨$2⟩ } ∪ E[[B : $1 : b1]] ∪ E[[B : $2 : b2]]
  E[[B : $ : β]]        = { ⟨$⟩ = ⟨β⟩ }
  E[[B : $ : rec β. b]] = CLOSE$,β( { ⟨$⟩ = ⟨$1⟩, ⟨$⟩ = ⟨β⟩ } ∪ E[[B : $1 : b]] )

  Table 2: Constructing the equation system

The function E for generating the equations for the overall behaviour B achieves this by the call E[[B : ε : b]] where ε denotes the empty tree-address. In general B : $ : b indicates that the subtree of B rooted at $ is of the form b, and the result of E[[B : $ : b]] is the set of equations produced for b. The formal definition is given in Table 2. The key idea is that E[[B : $ : b]] operates with flow variables of the form ⟨$′⟩ and ⟨β′⟩.

¹We take a post-fixed point of a function f to be an argument n such that f(n) ⊑ n.
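Once a system of monotone equations over flow variables has been generated, its least solution can be computed by chaotic iteration from bottom. The following toy sketch is ours: the system is hand-written, the flow variables range over A3 (encoded 0, 1, 2) rather than LabSet →f Abs, and the self-dependency mimics what CLOSE produces for rec.

```python
def solve(equations, bottom):
    """Least solution of monotone equations by round-robin iteration.
    equations: dict mapping each variable to a function of the current env."""
    env = {x: bottom for x in equations}
    changed = True
    while changed:
        changed = False
        for x, rhs in equations.items():
            v = rhs(env)
            if v != env[x]:
                env[x] = v
                changed = True
    return env

# toy system over A3 (0 = o, 1 = i, 2 = m):
#   <e> = <e1> ⊕ <e2>,  <e1> = i,  <e2> = <e> ⊔ i   (self-dependency)
plus = lambda a, b: min(a + b, 2)
eqs = {
    'e':  lambda s: plus(s['e1'], s['e2']),
    'e1': lambda s: 1,
    'e2': lambda s: max(s['e'], 1),
}
print(solve(eqs, 0))   # {'e': 2, 'e1': 1, 'e2': 2}
```

Since every right-hand side is monotone and the lattice is finite, the iteration is guaranteed to terminate in the least solution; the techniques cited at the end of this section (SCC ordering, widening) refine exactly this scheme.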
We maintain the invariant that all $′ occurring in E[[B : $ : b]] are (possibly empty) prolongations of $ and that all β′ occurring in E[[B : $ : b]] are exposed in b. To maintain this invariant in the case of recursion we define

  CLOSE$,β(E) = { (L[⟨$⟩/⟨β⟩] = R[⟨$⟩/⟨β⟩]) | (L = R) ∈ E }

(although it would actually suffice to apply the substitution [⟨$⟩/⟨β⟩] on the right-hand sides of equations, and it would then be correct to remove the trivial equation produced). Terms of the equations are formal terms over the flow variables (that range over the complete lattice LabSet → Abs), the operations ⊕ and ⊔, and the constants (that are elements of the complete lattice LabSet → Abs). Thus all terms are monotone in their free flow variables. A solution σ to a set E of equations is a partial function from flow variables to LabSet → Abs such that all flow variables in E are in the domain of σ and such that all equations (L = R) of E have σ(L) = σ(R), where σ is extended to formal terms in the obvious way. We write σ ⊨ E whenever this is the case.

Theorem 5.1 [ ] ⊢ b : A iff there exists σ such that σ ⊨ E[[b : ε : b]] and σ(⟨ε⟩) = A.

Corollary 5.2 The least (or greatest) A such that [ ] ⊢ b : A is σ(⟨ε⟩) for the least (or greatest) σ such that σ ⊨ E[[b : ε : b]].

We have now transformed our inference problem to a form where the standard algorithmic techniques can be exploited. These include simplifications of the equation system, partitioning the equation system into strongly connected components processed in (reverse) topological order, widening to ensure convergence when Abs does not have finite height, etc.; a good overview of useful techniques may be found in [2, 7, 10, 17]. Also the flow variables may be decomposed into families of flow variables over simpler value spaces.

6 Static Processor Allocation

The idea behind static processor allocation is that all processes with the same label will be placed on the same processor and we would therefore like to know what requirements this puts on the processor.
To obtain such information we shall extend the simple counting analysis of Section 4 to associate information with the process labels mentioned in a given behaviour b. For each process label La we therefore ask the four questions of Section 4, accumulating the total information for all processes with label La: how many times are channels labelled by L created, how many times do channels labelled by L participate in input, how many times do channels labelled by L participate in output, and how many times are processes labelled by L generated?

Example 6.1 Let us return to the pipe function of Examples 2.1 and 2.2 and suppose that we want to perform static processor allocation. This means that all instances of the processes labelled π will reside on the same processor. The analysis should therefore estimate the total requirements of these processes as follows:

  main: L1: m channels created      π: L1: m inputs performed
        π:  m processes created        L2: m outputs performed
                                       L:  m outputs performed

Note that even though each process labelled by π can only communicate once over L, we can generate many such processes and their combined behaviour is to communicate many times over L. It follows from this analysis that the main program does not in itself communicate over L2 or L and that the π processes do not by themselves spawn new processes.

Now suppose we have a network of three interconnected processors P1, P2 and P3. One way to place our processes is to place the main program on P1 and all the processes labelled π on P2. This requires support for multitasking on P2 and for multiplexing (over L1) on P1 and P2.

The analysis (specified in Table 3) is obtained by modifying the inference system of Section 4 to have formulae benv ⊢ b : A & P where A ∈ LabSet →f Abs as before and the new ingredient is

  P : LabSet →f (LabSet →f Abs)

The idea is that if some process is labelled La then P(La) describes the total requirements of all processes labelled by La.
The behaviour environment benv is an extension of that of Section 4 in that it associates pairs A & P with the behaviour variables. Note that in the rule for forkL we have removed the "⊕ A" component from the local effect; instead it is incorporated in the global effect for L.

To express the correctness of the analysis we need to keep track of the relationship between the process identifiers and the associated labels. So let penv be a mapping from process identifiers to elements La of LabSet. We shall say that penv respects the derivation sequence PB =[a1; ps1]⇒ … =[ak; psk]⇒ PB′ if whenever (ai, psi) has the form (forkL b, (pi1, pi2)) then penv(pi2) = L; this ensures that the newly created process (pi2) indeed has a label (in L) as reported by the semantics. We can now redefine the function COUNT of Section 4. Given a list X of pairs of actions and lists of process identifiers define

  COUNTpenv(X) = λLa. λL. (CC_La(X, L), CI_La(X, L), CO_La(X, L), CF_La(X, L))

where

  CC_La(X, L): the number of elements of the form (t chanL, pi) in X where penv(pi) = La,
  CI_La(X, L): the number of elements of the form (L′!t?L, (pi′, pi)) in X where penv(pi) = La,
  CO_La(X, L): the number of elements of the form (L!t?L′, (pi, pi′)) in X where penv(pi) = La, and
  CF_La(X, L): the number of elements of the form (forkL b, (pi, pi′)) in X where penv(pi) = La.

  benv ⊢ ε : [ ] & [ ]
  benv ⊢ L!t : [L ↦ (o, o, i, o)] & [ ]
  benv ⊢ L?t : [L ↦ (o, i, o, o)] & [ ]
  benv ⊢ t chanL : [L ↦ (i, o, o, o)] & [ ]

  benv ⊢ b : A & P
  ─────────────────────────────────────────────────────
  benv ⊢ forkL b : [L ↦ (o, o, o, i)] & ([L ↦ A] ⊕ P)

  benv ⊢ b1 : A1 & P1   benv ⊢ b2 : A2 & P2
  ─────────────────────────────────────────────────────
  benv ⊢ b1; b2 : A1 ⊕ A2 & P1 ⊕ P2

  benv ⊢ b1 : A1 & P1   benv ⊢ b2 : A2 & P2
  ─────────────────────────────────────────────────────
  benv ⊢ b1 + b2 : A1 ⊔ A2 & P1 ⊔ P2

  benv[β ↦ A & P] ⊢ b : A & P
  ─────────────────────────────────────────────────────
  benv ⊢ rec β. b : A & P

  benv ⊢ β : A & P   if benv(β) = A & P

  Table 3: Analysis for static processor allocation

Soundness of the analysis then amounts to:

Theorem 6.2 Assume that [ ] ⊢ b : A & P and [pi0 ↦ b] =[a1; ps1]⇒ … =[ak; psk]⇒ PB, and let penv be a mapping from process identifiers to elements of LabSet respecting the above derivation sequence and such that penv(pi0) = L0. We then have

  R*(COUNTpenv[(a1, ps1), …, (ak, psk)]) ⊑ (P ⊕ [L0 ↦ A])

where R*(C)(La)(L) = (R(c), R(i), R(o), R(f)) if C(La)(L) = (c, i, o, f).

Note that the left-hand side of the inequality counts the number of operations for all processes whose label is given (by La); hence our information is useful for static processor allocation. To obtain an efficient implementation of the analysis it is once more profitable to generate an equation system. This is hardly any different from the approach of Section 5 except that by now there is even greater scope for decomposing the flow variables into families of flow variables over simpler value spaces.

7 Dynamic Processor Allocation

The idea behind dynamic processor allocation is that the decision of how to place processes on processors is taken dynamically. Again we will be interested in knowing which requirements this puts on the processor, but in contrast to the previous section we are only concerned with a single process rather than all processes with a given label. We shall now modify the analysis of Section 6 to associate worst-case information with the process labels rather than accumulating the total information. For each process label La we therefore ask the four questions of Section 4, taking the maximum information over all processes with label La: how many times are channels labelled by L created, how many times do channels labelled by L participate in input, how many times do channels labelled by L participate in output, and how many times are processes labelled by L generated?

  benv ⊢ b : A & P
  ─────────────────────────────────────────────────────
  benv ⊢ forkL b : [L ↦ (o, o, o, i)] & ([L ↦ A] ⊔ P)

  benv ⊢ b1 : A1 & P1   benv ⊢ b2 : A2 & P2
  ─────────────────────────────────────────────────────
  benv ⊢ b1; b2 : A1 ⊕ A2 & P1 ⊔ P2

  benv[β ↦ A] ⊢ b : A & P
  ─────────────────────────────────────────────────────
  benv ⊢ rec β. b : A & P

  benv ⊢ β : A & [ ]   if benv(β) = A

  Table 4: Analysis for dynamic processor allocation
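The crux of the change from Table 3 to Table 4 is the combination operator on the P component: ⊕ accumulates over all instances of a label, ⊔ takes the worst single instance. The toy computation below (our own illustrative simplification on A3 with o, i, m as 0, 1, 2, not the inference rules themselves) makes the difference concrete for a worker that outputs on L at most once per instance.

```python
# Static vs dynamic combination of per-instance effects on A3 (0=o, 1=i, 2=m).
plus = lambda a, b: min(a + b, 2)   # ⊕: total over all instances
join = max                          # ⊔: worst case of a single instance

per_instance_L = 1                  # each worker outputs on L at most once (i)
spawns = [per_instance_L] * 3       # three forked instances

static_L = 0
dynamic_L = 0
for a in spawns:
    static_L = plus(static_L, a)    # accumulates: i ⊕ i ⊕ i = m
    dynamic_L = join(dynamic_L, a)  # maximises:   i ⊔ i ⊔ i = i

print(static_L, dynamic_L)          # 2 1  (m statically, i dynamically)
```

This is exactly the discrepancy visible between Examples 6.1 and 7.1: the static analysis reports m outputs over L for the π processes taken together, the dynamic analysis reports i per process.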
Example 7.1 Let us return to the pipe function of Examples 2.1 and 2.2 and suppose that we want to perform dynamic processor allocation. This means that all the processes labelled need not reside on the same processor. The analysis should therefore estimate the maximal requirements of the instances of these processes as follows: main: L1: m channels created : L1: m inputs performed : m processes created L2: m outputs performed L: i output performed Note that now we do record that each individual process labelled by actually only communicates over L at most once. Returning to the processor network of Example 6.1 we may allocate the main program on P1 and the remaining processes on P2 and P3 (and possibly P1 as well): say f1 and f3 on P2 and f2 and id on P3. Facilities for multitasking are needed on P2 and P3 and facilities for multiplexing on all of P1, P2 and P3.The inference system still has formulae benv ` b : A & P where A and P are as in Section 6 and now benv is as in Section 4: it does not incorporate the P component2. Most of the axioms and rules are as in Table 3; the modi cations are listed in Table 4. A di erence from Section 6 is that now we need to keep track of the individual process identi ers. We therefore rede ne the function COUNTpenv as follows: 2It could be as in Section 6 as well because we now combine P components using t rather than . 283 COUNTpenv(X ) = La: L:((CCPI(X ; L); CIPI(X ; L); COPI(X ; L); CFPI(X ; L))where PI = penv1(La))CCPI(X ; L): the maximum over all pi 2 PI of the number of elementsof the form (t chanL; pi) in X ,CIPI(X ; L): the maximum over all pi 2 PI of the number of elementsof the form (L0!t?L; (pi0; pi)) in X ,COPI(X ; L): the maximum over all pi 2 PI of the number of elementsof the form (L!t?L0; (pi; pi0)) in X , andCFPI(X ; L): the maximum over all pi 2 PI of the number of elementsof the form (forkLb; (pi; pi0)) in X .Soundness of the analysis then amounts to:Theorem 7.2 Assume that ; ` b : A & P and [pi0 7! 
b] =)a1ps1 : : : =)akpskPB and let penv be a mapping from process identi ers to elements of LabSetrespecting the above derivation sequence and such that penv(pi0) = L0. Wethen haveR(COUNTpenv[(a1; ps1); ; (ak; psk)]) v (Pt[L0 7! A])where R is as in Theorem 6.2.Note that the lefthand side of the inequality gives the maximum number ofoperations over all processes with a given label; hence our information is usefulfor dynamic processor allocation.To obtain an e cient implementation of the analysis it is once more prof-itable to generate an equation system and the remarks at the end of the previoussection still apply.8 ConclusionThe speci cations of the analyses for static and dynamic allocation have much incommon; the major di erence of course being that for static processor allocationwe accumulate the total numbers whereas for dynamic processor allocation wecalculate the maximum; a minor di erence being that for the static analysis itwas crucial to let behaviour environments include the P component whereasfor the dynamic analysis this was hardly of any importance.This di erence in approach is reminiscent of the di erence between theformulation of MFP-style and MOP-style analyses: in the former the e ects ofpaths (corresponding to process identi ers with the same label set) are mergedalong the way whereas in the latter the paths (corresponding to the processidenti ers) have to be kept separate and their e ects can only be merged whenthe propagation of e ects has taken place.Acknowledgements. We would like to thank Torben Amtoft for many in-teresting discussions. This research has been funded in part by the LOMAPS(ESPRIT BRA) and DART (Danish Science Research Council) projects.284 References[1] T.Amtoft, F.Nielson, H.R.Nielson: Type and behaviour reconstruction for higher-order concurrent programs. This proceedings.[2] J.Cai, R.Paige: Program Derivation by Fixed Point Computation. Science ofComputer Programming 11, pp. 197{261, 1989.[3] R. 
Cridlig, E.Goubault: Semantics and analysis of Linda-based languages. Proc.Static Analysis, Springer Lecture Notes in Computer Science 724, 1993.[4] C.E.McDowell: A practical algorithm for static analysis of parallel programs.Journal of parallel and distributed computing 6, 1989.[5] A.Giacalone, P.Mishra, S.Prasad: Operational and Algebraic Semantics for Facile:a Symmetric Integration of Concurrent and Functional Programming. Proc.ICALP'90, Springer Lecture Notes in Computer Science 443, 1990.[6] K.Havelund, K.G.Larsen: The Fork Calculus. Proc. ICALP'93, Springer LectureNotes in Computer Science 700, 1993.[7] M.S.Hecht: Flow Analysis of Computer Programs, North-Holland, 1977.[8] Y.-C.Hung, G.-H.Chen: Reverse reachability analysis: a new technique for dead-lock detection on communicating nite state machines. Software | Practice andExperience 23, 1993.[9] S.Jagannathan, S.Week: Analysing stores and references in a parallel symboliclanguage. Proc. L&FP;, 1994.[10] M.Jourdan, D.Parigot: Techniques for Improving Grammar Flow Analysis. Proc.ESOP'90, Springer Lecture Notes in Computer Science 432, pp. 240{255, 1990.[11] N. Mercouro : An algorithm for analysing communicating processes. Proc. ofMFPS, Springer Lecture Notes in Computer Science 598, 1992.[12] F.Nielson, H.R.Nielson: From CML to Process Algebras. Proc. CONCUR'93,Springer Lecture Notes in Computer Science 715, 1993.[13] H.R.Nielson, F.Nielson: Higher-Order Concurrent Programs with Finite Commu-nication Topology. Proc. POPL'94, pp. 84{97, ACM Press, 1994.[14] F.Nielson, H.R.Nielson: Constraints for Polymorphic Behaviours for ConcurrentML. Proc. CCL'94, Springer Lecture Notes in Computer Science 845, 1994.[15] J.H.Reif, S.A.Smolka: Data ow analysis of distributed communicating processes.International Journal of Parallel Programs 19, 1990.[16] J.R.Reppy: Concurrent ML: Design, Application and Semantics. Springer LectureNotes in Computer Science 693, pp. 
165-198, 1993.
[17] R. Tarjan: Iterative Algorithms for Global Flow Analysis. In J. Traub (ed.), Algorithms and Complexity, pp. 91-102, Academic Press, 1976.
[18] B. Thomsen: Personal communication, May 1994.
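The MFP versus MOP distinction invoked in the conclusion can be made concrete with the textbook constant-propagation example. The following Python sketch is ours, not the paper's analysis: two paths reach a join point with different constant bindings, and a non-distributive transfer function (`z := x + y`) shows why MOP, which keeps paths separate and merges only at the end, can be strictly more precise than MFP, which merges at the join first. All names here (`TOP`, `transfer_add`, and so on) are illustrative inventions.

```python
# Minimal sketch of MFP vs MOP on constant propagation (illustration only).
# TOP denotes "unknown constant"; joining two different constants gives TOP.

TOP = object()

def join_val(a, b):
    # Least upper bound of two abstract values.
    return a if a == b else TOP

def join_env(e1, e2):
    # Pointwise join of two abstract environments.
    return {k: join_val(e1[k], e2[k]) for k in e1}

def transfer_add(env):
    # Transfer function for z := x + y; non-distributive over join.
    x, y = env['x'], env['y']
    z = TOP if (x is TOP or y is TOP) else x + y
    return {**env, 'z': z}

# Two paths reaching the join point with different environments.
path1 = {'x': 1, 'y': 2, 'z': TOP}
path2 = {'x': 2, 'y': 1, 'z': TOP}

# MOP: apply the transfer function along each path, merge only at the end.
mop = join_env(transfer_add(path1), transfer_add(path2))

# MFP: merge the paths at the join point first, then apply the transfer.
mfp = transfer_add(join_env(path1, path2))

print(mop['z'])         # 3: each path separately yields z = 3
print(mfp['z'] is TOP)  # True: after merging, x and y are both unknown
```

In the paper's terms, MOP corresponds to keeping the process identifiers apart and merging their effects only after propagation, while MFP corresponds to merging the effects of all process identifiers with the same label set along the way.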